Quality Assessment of Research Articles in Nuclear Medicine Using STARD and QUADAS-2 Tools

Objective(s): Diagnostic nuclear medicine is increasingly used in clinical practice with the advent of new technologies and radiopharmaceuticals. Reporting the prevalence of the studied disease is important when the quality of a diagnostic accuracy article is assessed. Therefore, this study was performed to evaluate the quality of published nuclear medicine articles and determine the frequency of reporting the prevalence of the studied diseases.
Methods: We used the Standards for Reporting of Diagnostic Accuracy (STARD) and Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) checklists to evaluate the quality of articles published in five nuclear medicine journals with the highest impact factors in 2012. The articles were retrieved from the Scopus database and were selected and assessed independently by two nuclear medicine physicians. Decisions concerning equivocal data were made by consensus between the reviewers.
Results: The average STARD score was approximately 17 points, and the highest score (17.19±2.38) was obtained by the European Journal of Nuclear Medicine. The QUADAS-2 tool showed that all journals had a low risk of bias regarding study population. The Journal of Nuclear Medicine had the highest score in terms of index test, reference standard, and time interval. Lack of clarity regarding the index test, reference standard, and time interval was frequently observed in all journals, including Clinical Nuclear Medicine, in which 64% of the studies were unclear regarding the index test. The Journal of Nuclear Cardiology had the highest proportion of articles with an appropriate reference standard (83.3%), though it had the lowest frequency of reporting disease prevalence (zero reports). All five journals had similar STARD scores, while the index test, reference standard, and time interval were frequently unclear according to the QUADAS-2 tool. The data were too limited to determine which journal had the lowest risk of bias. It is the authors' responsibility to provide details of the research methodology so that the reader can assess the quality of research articles.
Conclusion: The five nuclear medicine journals with the highest impact factors were comparable in terms of STARD score, although they all showed a lack of clarity regarding the index test, reference standard, and time interval according to QUADAS-2. The current data were too limited to determine the journal with the lowest risk of bias. Thus, a comprehensive description of the research methodology of each article is of paramount importance to enable the reader to assess the quality of articles.


Introduction
Nuclear medicine imaging and diagnostic radiological studies, including computed tomography (CT) and magnetic resonance imaging (MRI), are important for patient management, particularly for making an accurate diagnosis and for staging or restaging a disease. Even though nuclear medicine studies tend to be less specific than diagnostic radiological imaging, they mostly have high sensitivity, which makes them suitable for early diagnosis, staging, and restaging of diseases.
Some diagnostic nuclear medicine tests are helpful because of their high negative predictive value. Because predictive values depend on how common the disease is in the tested population, reporting the prevalence of the disease is essential for helping physicians make decisions based on test results.
Reporting the prevalence of the disease or condition is therefore very important in studies of diagnostic tests, since it affects the positive predictive value (PPV) of the test. A test carried out in a population with a high prevalence of the disease has a higher PPV than the same test performed in a population where the disease is rare.
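The dependence of predictive values on prevalence can be illustrated with a minimal sketch based on Bayes' theorem. The function name `predictive_values` and the sensitivity, specificity, and prevalence figures below are hypothetical and are not taken from any of the reviewed articles.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All three arguments are proportions between 0 and 1. Degenerate inputs
    that make a denominator zero are not handled in this sketch.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    # Expected fraction of the population in each cell of the 2x2 table
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test (sensitivity 0.90, specificity 0.80) at two prevalences
for prevalence in (0.50, 0.05):
    ppv, npv = predictive_values(0.90, 0.80, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With these assumed figures, the PPV falls from about 0.82 at 50% prevalence to about 0.19 at 5% prevalence, while the negative predictive value rises from about 0.89 to about 0.99. A reader cannot make this adjustment unless the article reports the prevalence in the study population.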
Various radioisotopes are used in nuclear medicine imaging studies, and the radiation dose that each patient receives must be appropriate. Physicians who refer patients for nuclear medicine tests or diagnostic radiology should be aware of the radiation risks to their patients.
Findings of many studies in diagnostic nuclear medicine provide new insights into this field and help clinicians make decisions for patient management. The Standards for Reporting of Diagnostic Accuracy (STARD), a well-established tool for assessing the value of diagnostic studies, has been adopted by many journals and can be found at www.stardstatement.org. If researchers report their study methods and findings according to the STARD checklist, readers will be able to assess the validity of the publication.
Developed from the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool (1, 2), QUADAS-2 is used to assess studies that are candidates for inclusion in systematic reviews of the Cochrane Library. This tool focuses on the methodology of a study, since the value of the results depends on the methodology. The checklist rates the risk of bias (high/low/unclear), although it does not appraise the results or discussion sections.
The reference standard is of high significance in diagnostic studies. In clinical imaging studies, readers must be familiar with the gold standard for each specific disease (such as a histopathology report, angiogram, or culture). Some studies may instead use other imaging modalities, follow-up with the same test, or other methods as the reference standard.
Research articles published in journals with high impact factors are generally of high quality; therefore, readers may be inclined to rely on an article because of the impact factor of the journal in which it is published. The prevalence of the studied disease, although an essential part of diagnostic radiology and nuclear medicine research, is not reported in some articles. This study was carried out to determine the frequency of reporting disease prevalence in articles from nuclear medicine journals with high impact factors, and to assess these articles against the STARD items, each signalling question in QUADAS-2, and the quality of the reference standard.

Methods
The journals were sorted according to their 2012 impact factors, as provided by the website www.medical-journals-links.com/radiology-journals-nuclear-medicine-imaging.php. Then, original and clinical research articles were selected from the Journal of Nuclear Medicine, European Journal of Nuclear Medicine and Molecular Imaging, Clinical Nuclear Medicine, Journal of Nuclear Cardiology, and Nuclear Medicine Communications.
Diagnostic clinical studies published in 2012 were included in the current study; studies with similar objectives and populations were excluded. We searched for the articles in the Scopus database and limited the results for each journal to studies published in 2012.
The search terms were limited to "sensitivity" and "specificity" in order to obtain comprehensive search results; then, articles evaluating diagnostic nuclear medicine tests were identified and included in the study. Two researchers read the abstracts of the articles separately and selected the diagnostic studies for further evaluation. In case of disagreement, the full article was read and a consensus between the two reviewers was reached.
To evaluate the quality of the research articles, we assessed each article and recorded the results in a form comprising the STARD checklist, QUADAS-2, whether disease prevalence was reported, and a checklist concerning the quality of the reference standard.
Descriptive statistics were used to analyze and describe the results. Frequency of each STARD and QUADAS-2 item was reported and mean and standard deviation of STARD scores were also calculated. Analysis of the data was carried out using SPSS version 17, and graphs were generated by Microsoft Excel version 2007.
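The descriptive analysis described above can be sketched as follows. This is only an illustration: the study itself used SPSS, and the records, field names, and the `summarize` helper below are hypothetical rather than the study data. The sketch assumes the sample standard deviation (n - 1 denominator) was used.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical assessment records, one per article (not the study data)
articles = [
    {"journal": "Journal A", "stard_score": 18, "index_test_bias": "low"},
    {"journal": "Journal A", "stard_score": 16, "index_test_bias": "unclear"},
    {"journal": "Journal A", "stard_score": 17, "index_test_bias": "unclear"},
    {"journal": "Journal A", "stard_score": 19, "index_test_bias": "low"},
    {"journal": "Journal A", "stard_score": 15, "index_test_bias": "high"},
]


def summarize(records):
    """Mean and SD of STARD scores plus counts and percentages of judgments."""
    scores = [r["stard_score"] for r in records]
    counts = Counter(r["index_test_bias"] for r in records)
    n = len(records)
    return {
        "mean": mean(scores),
        "sd": stdev(scores),  # sample SD; needs at least two records
        "frequency": {
            level: (counts[level], 100 * counts[level] / n)
            for level in ("low", "high", "unclear")
        },
    }


summary = summarize(articles)
print(f"STARD score: {summary['mean']:.2f} ± {summary['sd']:.2f}")
for level, (count, percent) in summary["frequency"].items():
    print(f"index test, {level} risk of bias: {count} ({percent:.1f}%)")
```

In practice the same summary would be computed per journal and per QUADAS-2 domain, which is how the frequencies and mean±SD values in the Results are organized.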

Results
Our search yielded 212 articles from 5 nuclear medicine journals, among which 101 articles were diagnostic studies. The number of the articles is presented in Figure 1.
Overall, the average STARD score was approximately 17 points (SD 2.39), and the European Journal of Nuclear Medicine had the highest score (17.19±2.38). Some items in the STARD checklist were absent in all journals, such as item No. 13 (describing the methods for calculating test reproducibility, if done) and item No. 24 (reporting the estimates of test reproducibility, if done). Item No. 20 (reporting any adverse events due to performing the index tests or reference standard) was found only in the Journal of Nuclear Medicine and Nuclear Medicine Communications. The frequency and proportion of reporting each item in all journals are shown in Table 1, and the average scores are reported in Table 2.
According to the QUADAS-2 checklist, many studies in all journals had a low risk of bias regarding the study population; however, the index test and reference standard were frequently unclear. The European Journal of Nuclear Medicine had the largest number of studies with a high risk of bias regarding the reference standard. On the other hand, there was a low risk of bias

The rates of reporting the prevalence of the studied diseases and of using an appropriate reference standard are shown in Table 4. The journal with the most frequent reporting of disease prevalence was Nuclear Medicine Communications. The Journal of Nuclear Cardiology had the highest rate of appropriate reference standard (66.7%).

Discussion
All the clinical nuclear medicine journals showed similar STARD scores. This may be related to the researchers' familiarity with STARD. Some STARD items were absent or infrequently reported, e.g., test reproducibility and adverse effects of the tests. For reproducibility, this might be due to the fact that all articles were clinical studies, in which it is often not possible to repeat a test on the same subject. Details of adverse effects may be omitted because they are reported to the ethics committee instead. Although reporting confidence intervals can help readers in the decision-making process, the rate of such reports was low (only 42% in the European Journal of Nuclear Medicine).
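To illustrate what a confidence interval adds to a reported accuracy estimate, the following minimal sketch computes a Wilson score interval for a proportion such as sensitivity. The choice of the Wilson method is an assumption made for illustration; the reviewed articles may have used other methods, and the counts in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")

    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = successes / total

    denom = 1.0 + z * z / total
    centre = (p_hat + z * z / (2.0 * total)) / denom
    half_width = z * sqrt(
        p_hat * (1.0 - p_hat) / total + z * z / (4.0 * total * total)
    ) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 45 of 50 diseased patients had a positive index test
low, high = wilson_interval(45, 50)
print(f"sensitivity = {45 / 50:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

In this hypothetical example a sensitivity of 0.90 comes with a 95% interval of roughly 0.79 to 0.96, which conveys how much the estimate could vary with a sample of only 50 patients.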
Another important item that should be reported in nearly all articles is whether the readers of the index test(s) and reference standard were blinded (masked) to the results of the other test; however, the highest rate of reporting this item in any journal was 73%.
According to the QUADAS-2 tool, the articles in all the studied journals were clear in terms of population and sample size. However, a high proportion of studies lacked clarity regarding the index test, e.g., 64% of the articles in Clinical Nuclear Medicine were unclear in this regard because the relevant details were not reported in the methodology section. There was also a high risk of bias in some studies because the cut-off point for diagnosis was not determined before the results of the reference standard were known.
Regarding the reference standard, some articles used other imaging modalities, which were not the true reference standard for the disease under study. We found a high risk of bias in 23.1% of the articles in the European Journal of Nuclear Medicine; the same was observed concerning time interval. This may be because the authors could not perform an invasive test or had to use more than one reference standard; the main difficulty concerned the verification of negative test results.
Disease prevalence was infrequently reported in all journals, e.g., it was mentioned in only 31% of the articles in Nuclear Medicine Communications. This may be related to patients' referral from different parts of the country to nuclear medicine centers; therefore, the authors could not report (or neglected to report) the prevalence of the disease.
Concerning the reference standard, 66.7% of the articles in the Journal of Nuclear Cardiology used the appropriate reference standard, since coronary artery catheterization is the only reference standard for the diagnosis of coronary heart disease. For other diseases, authors used histopathology, culture, or angiograms to confirm a positive finding in the index test. For negative results, researchers might have used other imaging studies or clinical follow-up as the reference standard.

Conclusion
The average STARD score was similar among all five nuclear medicine journals. According to QUADAS-2, there was a low risk of bias in terms of study population. However, there was a lack of clarity in other domains, including index test, reference standard, and time interval, due to insufficient reporting of details in the articles.
Overall, STARD can familiarize readers with the details of the methodology and results of a study and help them judge its risk of bias. With QUADAS-2, readers can classify the risk of bias in the methodology as low, high, or unclear; however, this tool does not inform them about bias in the results. We suggest that readers use both tools in the assessment of diagnostic research articles.